The third enactment of Cambridge
Healthtech Institute's Macroresults 
through Microarrays meeting was held 
in Boston (MA, USA) from 29 April- 
1 May 2002. The subtheme of this year's 
meeting was 'advancing drug discov- 
ery', a widely touted application for 
array technology. 

The evolution of microarrays
If you were asked 'Who first conceived
of the idea of microarrays', who would
come to mind? Mark Schena perhaps,
first author of the seminal 1995 paper 
on cDNA arrays [1]? Maybe Pat Brown, 
Schena's then supervisor? Or perhaps 
Stephen Fodor, the primary driver 
behind Affymetrix's (http://www. 
affymetrix.com) oligonucleotide-based 
platform [2]. Brits might even chant the
name of Ed Southern [3]. Well, according
to Roger Ekins (University College
London Medical School; http://www. 
ucl.ac.uk/medicine/) all these answers 
would be wrong. It was in fact Ekins 
and his colleagues who first conceived 
of and patented 'a new generation of 
ultrasensitive, miniaturized assays for 
protein and DNA-RNA measurement 
based on the use of microarrays' in the 
mid 1980s [4]. The concept and poten- 
tial of array technology was more fully 
described in a later publication, in 
which Ekins et al. [5] concluded that
antibody microspots of ~50 µm² could be
achieved, and that as many as 2 million
different immunoassays could, in principle,
be accommodated on a surface
area of 1 cm².
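
The two figures quoted above are mutually consistent, as a quick calculation shows. The sketch below is illustrative only: it assumes the spots tile the surface with no gaps, which is an idealization not stated in the text, and the function name is hypothetical.

```python
# Consistency check of the quoted spot-density figures.
# Assumption (not from the text): spots tile the surface with no gaps.
UM2_PER_CM2 = (1e4) ** 2  # 1 cm = 10^4 µm, so 1 cm² = 10^8 µm²


def spots_per_cm2(spot_area_um2: float) -> float:
    """Upper bound on the number of spots that fit on 1 cm²."""
    return UM2_PER_CM2 / spot_area_um2


print(spots_per_cm2(50.0))  # 2000000.0
```

A spot area of roughly 50 µm² therefore corresponds to about 2 million spots per cm².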

Technological innovation 
In practice, it took a different biological
molecule (DNA), a different research
group, and a leap into microfabrication
technology to even begin
approaching these kinds of densities
[Affymetrix patent 6045996 talks of
one million spots cm⁻²]. Of course,
advancing technology is one of the 
driving engines behind the genomics 
juggernaut, and we are already seeing 
'4th generation' machines for fab- 
ricating DNA chips. If the company 
representatives at this meeting are to 
be believed (and their cases seemed 
strong), spotting is out, and in situ 
fabrication of oligonucleotide-based 
'iterative custom arrays' is in. Whether 
you go with Combimatrix's (http://www.combimatrix.com)
electrochemically directed synthesis and detection
system, febit's (http://www.febit.com) 
Geniom® technology, or Nimblegen's 
(http://www.nimblegen.com) Maskless 
Array Synthesizer technology is a 
matter of personal choice. However, 
each of these machines provides the 
flexibility to design variable length 
oligonucleotide probes from se- 
quences inputted by the user, and then 
perform in situ synthesis of an array. 
Each system also boasts unique advan- 
tages. For example, Combimatrix's 
biological array processor is a semi- 
conductor coated with a 3D layer 
of porous material in which DNA, 
RNA, peptides or small molecules 
can be synthesized or immobilized 
within discrete test sites, while febit's 
Geniom One® is a fully integrated
gene-expression analysis system with
minimal user hands-on time - the 
probe sequences are programmed, the 
RNA samples inserted, and the gene 
expression data is pumped out a few 
hours later. 



Cell- and tissue-based arrays 
Array technology is in most people's 
minds firmly linked with gene-expression 
profiling. Fewer are aware that cell- and 
tissue-based arrays have been devel- 
oped, and how they can provide 
a vital extra dimension to research. In 
support of this, Barry Bochner gave an 
update on the cell-based array system 
that Biolog (http://www.biolog.com) 
has produced for simultaneously mea- 
suring the effects of one gene in the cell 
under thousands of growth conditions 
(see [6] for further details). David Walt 
(Tufts University; http://www.tufts.edu/)
is developing single live cell arrays
using optical imaging fiber (OIF)
technology. An array of microwells is
fabricated on the face of an OIF at densities
of up to 10 million wells cm⁻².
Cells are then added to the wells and 
disperse at an average of one cell per 
well. Physiological and genetic re- 
sponses of each cell are measured via 
fluorescence produced by reporter 
genes (e.g. lacZ, gfp). Assays performed
so far include yeast live or dead cell
assay, microenvironment pH and
O₂ measurements, promoter responses
using the lacZ and phoA reporter genes,
and protein-protein interactions using 
the yeast two-hybrid system. The main 
advantage of this system is that the cells 
remain alive during the assay, which 
means a real-time timecourse can be 
performed and/or the array passed 
from sample to sample. This would be 
useful in, for example, the scanning of
a combinatorial drug library for specific 
physiological effects. 

Tissue arrays are a useful complementary
technology to DNA arrays because
they can be used to help validate and
understand the biological and medical
significance of gene changes discovered
using standard DNA arrays. For
example, an array of tumor tissues can
be screened for the protein (using im- 
munohistochemistry), message (using 
in situ hybridization) and copy number 
(using comparative genomic hybridiza- 
tion) of a gene of interest, to determine 
if expression of the gene (or lack 
thereof) is related in any way to sur- 
vival. They can also be used to predict 
the probability of clinical failure of lead 
compounds as a result of toxicity by 
evaluating the distribution of the drug 
targets in normal tissue. Spyro Mousses 
and his co-workers at the National 
Human Genome Research Institute 
(http://www.nhgri.nih.gov/index.html) 
have built such arrays, including a 
multi-tumor array (~5000 specimens,
and sections from 36 normal and 800 
metastatic tissues) and a normal tissue 
array (76 tissue and 332 cell types). 

The problem with proteins 
It has been said that genomics tells us 
what might happen, transcriptomics 
indicates what should happen, and pro- 
teomics shows what is happening. The 
impact of functional proteomics on 
pharmaceutical R&D is rapidly increas- 
ing, and protein arrays are being used 
increasingly in both basic and applied 
research. Their use lies not only in com- 
parative protein expression and inter- 
action profiling, but also in diagnostics 
and drug discovery. However, an in- 
creasing number of researchers have 
found that protein arrays, like their 
cousins the DNA arrays, present several
practical obstacles relating to their production
and use. For example, in using
Escherichia coli to produce recombinant
eukaryotic proteins from a single
expression vector, multiple protein
products are often produced, suggesting
mixes of truncated or otherwise
altered proteins. There is also the obvious
concern that the proteins might
not be modified in a similar manner to
eukaryotic systems. Also, an optimal
method for depositing and binding 
proteins to the selected substrate is 
yet to be determined, as is the best 
way to ensure that they are bound in a 
correctly folded, active conformation. 

Several companies have been addressing
these problems. Prolinx (http://www.prolinxinc.com)
is one such company,
and Karin Hughes described their
Versalinx™ chemistry for producing 
protein, peptide and small-molecule 
arrays. Versalinx™ uses solution-phase 
conjugation followed by immobiliza- 
tion, resulting in functional orientation 
of proteins and peptides on the sub- 
strate surface. It also offers the valuable 
additional benefit of exhibiting low 
non-specific binding. Sense Proteomic 
(http://www.senseproteomic.com) is
also among those addressing these 
problems to develop robust protein 
arrays for drug discovery and clinical 
applications and has developed func- 
tional protein array formats based on 
specific disease tissues. Subtractive hy- 
bridization is used to identify genes 
with altered expression in breast tumor 
and cystic fibrosis compared to normal 
tissue. A high throughput cloning strat- 
egy (COVET™) is then used to produce 
libraries of genes that are tagged, 
cloned, expressed, purified and finally 
immobilized on glass slides. Initial vali- 
dation studies have shown that the vast 
majority of the immobilized proteins do 
indeed retain biological function. 

Stefan Schmidt and his company 
(GPC Biotech; http://www.gpcbiotech.de)
have moved past the platform development
stage and, with their focus
firmly on drug discovery, are currently 
developing kinase-profiling arrays. 
Kinases are important targets for phar- 
maceutical drug discovery and therapy, 
and GPC's aim is to simultaneously detect
multiple kinases, obtain activity profiles
for different cell types, or analyze
the ability of drug candidates to inhibit
kinase activity. To do this, recombinant
kinase substrates are immobilized on
membranes, incubated with purified
kinase, and the substrates measured for
the degree of phosphorylation.

Summary 

Meetings like this, packed with exciting 
discoveries and intriguing and interest- 
ing innovation, heavily emphasize the 
pace at which biotechnology is advanc- 
ing, to the extent that the number of 
options for genomic and proteomic re- 
searchers can become overwhelming. 
Although data analysis is perhaps the 
greatest current concern for array users, 
an increasing challenge will be to deter- 
mine the approaches and technology 
that really work, and to do it in a timely 
manner. 
A standard two-dimensional (2-D) protein map of Fischer 344 rat liver
(F344MST3) is presented, with a tabular listing of more than 1200 protein species.
Sodium dodecyl sulfate (SDS) molecular mass and isoelectric point have been es- 
tablished, based on positions of numerous internal standards. This map has been 
used to connect and compare hundreds of 2-D gels of rat liver samples from a
variety of studies, and forms the nucleus of an expanding database describing rat
liver proteins and their regulation by various drugs and toxic agents. An example
of such a study, involving regulation of cholesterol synthesis by cholesterol-lowering
drugs and a high-cholesterol diet, is presented. Since the map has been
obtained with a widely used and highly reproducible 2-D gel system (the Iso-Dalt*
system), it can be directly related to an expanding body of work in other laborato- 
ries. 
High-resolution two-dimensional electrophoresis of proteins,
introduced in 1975 by O'Farrell and others [1-4], has
been used over the ensuing 16 years to examine a wide va- 
riety of biological systems, the results appearing in more 
than 5000 published papers. With the advent of computer- 
ized systems for analyzing two-dimensional (2-D) gel ima- 
ges and constructing spot databases, it is also possible to 
plan and assemble integrated bodies of information de- 
scribing the appearance and regulation of thousands of pro- 
tein gene products [5, 6]. Creating such databases involves 
amassing and organizing quantitative data from thousands 
of 2-D gels, and requires a substantial commitment in tech- 
nology and resources. 

Given the long-term effort required to develop a protein da- 
tabase, the choice of a biological system takes on consider- 
able importance. While in vitro systems are ideal for answer- 
ing many experimental questions, especially in cancer re- 
search and genetics, our experience with cell cultures and 
tissue samples suggests that some in vivo approaches could 
have major advantages. In particular, we have noticed that 
liver tissue samples from rats and mice appear to show greater
quantitative reproducibility (in terms of individual protein
expression) than replicate cell cultures. This is perhaps
a natural result of the homeostasis maintained in a complete
animal vs. the well-known variability of cell cultures,
the latter due principally to differences in reagents (e.g., 
fetal bovine serum), conditions (e.g., pH) and genetic "evo- 
lution" of cell lines while in culture. It is also more difficult 
to generate adequate amounts of protein from cell culture 
systems (particularly with attached cells), forcing the inves- 
tigator to resort to radioisotope-based or silver-based stain- 
detection methods. While these methods are more sensi- 
tive (sometimes much more sensitive) than the Coomassie 
Brilliant Blue (CBB) stain typically used for protein detec- 
tion in "large" protein samples, they are generally more vari- 
able, more labor-intensive and, in the case of radiographic 
methods, may generate highly "noisy" images, due to the 
properties of the films used. By contrast, large protein sam- 
ples can easily be prepared from liver using urea/Nonidet 
P-40 (NP-40) solubilization and stained with CBB, which 
has the advantage of being easily reproducible [8]. Finally,
there remains the question of the "truthfulness" of many in
vitro systems as compared to their in vivo analogs; how
great are the changes caused by the introduction into a culture
and the associated shift to strong selection for growth,
and how do these affect experimental outcomes? Hence
the apparent advantages of in vitro systems, in terms of ex- 
perimental manipulation, may be counterbalanced by 
other factors relating to 2-D data quality. 

There is a second important class of reasons for exploring 
the use of an in vivo biological system such as the liver. Historically,
there have been two broad approaches to the mechanistic
dissection of biochemical processes in intact cellular
systems: the genetic approach (a search for informative mutants)
and the use of chemical agents (drugs and chemical toxins).
Both approaches help us to understand complex systems 
by disrupting some specific functional element and showing
us the result. With the development of techniques for
genetic manipulation and cloning, the genetic approach
can be effectively applied either in vitro or in vivo, although
the in vitro route is usually quicker. The chemical approach
can also be applied to either sort of biological system; here,
however, the bulk of consistently acquired information is
in experimental animals (rats and mice). While most biologists
know a short list of compounds having specific, experimentally
useful effects (e.g., inhibitors of protein synthesis,
ionophores, polymerase inhibitors, channel blockers, nucleotide
analogs, and compounds affecting polymerization
of cytoskeletal proteins), there is a much larger number of
interesting chemically-induced effects, most of them char- 
acterized by toxicologists and pharmacologists in rodent 
systems. Just as a thorough genetic analysis would involve 
saturating a genome with mutations, it is possible to ima- 
gine a saturating number of drugs, the analysis of whose ac- 
tions would reveal the complete biochemistry of the cell. 
While organized drug discovery efforts usually target specific
desired effects, the nature of the process, with its dependence
on screening large numbers of compounds, necessarily
produces many unanticipated effects. It is therefore
reasonable to suppose that the broad range of
compounds necessary to achieve "biochemical saturation"
may be forthcoming; in fact, it may already exist among the 
hundreds of thousands of compounds that failed to qualify 
as drugs. 

Among organs, the liver is an obvious choice for the study 
of chemical effects because of its well-known plasticity and
responsiveness. The brain appears to be quite plastic (e.g. 
[7]), but it is a complicated mixture of cell types requiring 
skillful dissection for most experiments. The kidney, while 
quite responsive, also presents a potentially confounding 
mixture of cell types. The liver, by contrast, is made up of 
one predominant cell type which is easy to solubilize: the 
hepatocyte, representing more than 95% of its mass. Most 
importantly, the liver performs many homeostatic functions
that require rapid modulation of gene expression. It
appears that most chemical agents tested affect gene ex- 
pression in the liver at some dosage (N. Leigh Anderson, 
unpublished observations), an interesting contrast to our 
earlier work with lymphocytes, for example, which seem to 
be much less responsive. Such results conform to the expectation
that cells with a homeostatic, physiological role
should be more plastic than cells differentiated for a pur- 
pose dependent on the action of a limited number of spe- 
cific genes. 

The liver also allows the parallels between in vitro and in 
vivo systems to be examined in detail. Significant progress
has been made in the development of mouse, rat and human
hepatocyte culture systems, as well as in precision-cut
tissue slices. Using such an array of techniques, it is possible
to assemble a matrix of mammalian systems including
mouse and rat in vivo on one level and mouse, rat and hu- 
man in vitro on a second level, and to compare effects be- 
tween species and between systems. This approach allows 
us to draw informed conclusions regarding the biochemical 
"universality'' of biological responses among the mammals, 
and to offer some insight into the validity of in vitro ap- 
proaches for toxicological screening. We believe this data 
will be necessary if in vitro alternatives are to achieve wide 
usage in government-mandated safety testing of drugs, con- 
sumer products and industrial and agricultural chemicals. 

A number of interesting studies have been published using
2-D mapping to examine effects in the rodent liver. Several
investigators have made use of the technique to
screen for existing genetic variants [8-11] or induced mutations
[12-14], mainly in the mouse. This work builds on the
wealth of genetic information available on the mouse and 
its established position as a mammalian mutation-detec- 
tion system. While some studies of chemical effects have 
been undertaken in the mouse [15-17], most have used the 
rat [18-23]. The examination of the cytochrome P-450 system,
in particular, has been carried out almost exclusively
on the rat [24, 25]. 

These considerations lead us to conclude that rodent liver 
offers the best opportunity to systematically examine an 
array of gene regulation systems, and ultimately to build a 
predictive model of large-scale mammalian gene control. 
The foundation of such a project is a reliable,
reproducible master 2-D pattern of liver, to which ongoing
experimental results can be referred. In this paper, we
report such a master pattern for the acidic and neutral proteins
of rat liver (pattern F344MST3). In future, this master
will be supplemented by maps of basic proteins, and analog- 
ous maps of mouse and human liver. 



2 Materials and methods 
2.1 Sample preparation 

Liver is an ideal sample material for most biochemical stud- 
ies, including 2-D analysis. A sample is taken of approxima- 
tely 0.5 g of tissue from the apical end of the left lobe of the 
liver. Solubilization is effected as rapidly as practical; a 
delay of 5-15 min appears to cause no major alteration in 
liver protein composition if the liver pieces are kept cold 
(e.g., on ice) in the interim. In the solubilization process, 
the liver sample is weighed, placed in a glass homogenizer
(e.g., 15 mL Wheaton); 8 volumes of solubilizing solution*

* The solubilizing solution is composed of 2% NP-40 (Sigma), 9 M urea
(analytical grade, e.g., BDH or Bio-Rad), 0.5% dithiothreitol (DTT;
Sigma) and 2% carrier ampholytes (pH 9-11, LKB; these come as a 20%
stock solution, so 2% final concentration is achieved by making the final
solution 10% 9-11 Ampholine by volume). A large batch of solubilizer
(several hundred mL) is made and stored frozen at -80°C in aliquots
sufficient to provide enough for one day's estimated sample preparation
requirement. The solution is never allowed to become warmer
than room temperature at any stage during preparation or thawing for
use, since heating of concentrated urea solutions can produce contaminants
that covalently modify proteins, producing artifactual charge
shifts. Once thawed, any unused solubilizer is discarded.
is added (i.e., 4 mL per 0.5 g tissue) and the mixture is homogenized
using first the loose- and then the tight-fitting
glass pestle. This takes approximately 5 strokes with
each pestle and is carried out at room temperature because
urea would crystallize out in the cold. Once the liver sample
is thoroughly homogenized in the solubilizer, it is assumed
that all the proteins are denatured (by the chaotropic effect
of the urea and NP-40 detergent) and the enzymes inactivated
by the high pH (~9.5). Therefore these samples may
be kept at room temperature until they can be centrifuged
or frozen as a group (within several hours of preparation).
The samples are centrifuged for 6 × 10^6 g·min (e.g., 500 000
× g for 12 min using a Beckman TL-100 centrifuge). The
centrifuge rotor is maintained at just below room temperature
(e.g., 15-20°C), but not too cold, so as to prevent the
precipitation of urea. The centrifuge of choice is a Beckman
TL-100 because of the sample tube sizes available, but any
ultracentrifuge accepting smallish tubes will suffice. When
an appropriate centrifuge is not available near the site of
sample preparation, samples can be frozen at -80°C and
thawed prior to centrifugation and collection of supernatants.
Each supernatant is carefully removed following centrifugation
and aliquoted into at least 4 clean tubes for storage.
This is done by transferring all the supernatant to one
clean tube, mixing this gently (to assure homogeneous
composition) and then dividing it into 4 aliquots. The aliquots
are frozen immediately at -80°C. These multiple aliquots
can provide insurance against a failed run or a freezer
breakdown.
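
The arithmetic behind the sample preparation can be sketched as follows. This is a minimal illustration, not part of the published protocol: the function names are hypothetical, and 1 g of tissue is treated as roughly 1 mL when counting "volumes", which is an assumption. The centrifugation figure is read here as 6 × 10^6 g·min, which is what 500 000 × g for 12 min gives.

```python
def solubilizer_volume_ml(tissue_g: float, volumes: float = 8.0) -> float:
    """Solubilizer volume, assuming 1 g of tissue occupies about 1 mL."""
    return tissue_g * volumes


def stock_fraction(final_pct: float, stock_pct: float) -> float:
    """Fraction of the final volume taken up by stock (C1*V1 = C2*V2)."""
    return final_pct / stock_pct


def g_minutes(rcf_g: float, minutes: float) -> float:
    """Integrated centrifugal exposure in g·min."""
    return rcf_g * minutes


print(solubilizer_volume_ml(0.5))   # 4.0 mL for a 0.5 g liver sample
print(stock_fraction(2.0, 20.0))    # 0.1, i.e. 10% stock by volume
print(g_minutes(500_000, 12))       # 6000000
```

Expressing the spin as g·min makes it possible to trade speed against time on a different rotor, for example a lower force applied for proportionally longer.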

2.2 Two-dimensional electrophoresis

Sample proteins are resolved by 2-D electrophoresis using
the 20 × 25 cm Iso-Dalt* 2-D gel system ([26-29]; produced
by LSB and by Hoefer Scientific Instruments, San
Francisco) operating with 20 gels per batch. All first-dimensional
isoelectric focusing (IEF) gels are prepared using the
same single standardized batch of carrier ampholytes
(BDH 4-8A in the present case, selected by LSB's batch-testing
program for rat and mouse database work**). A 10
µL sample of solubilized liver protein is applied to each gel,
and the gels are run for 33 000 to 34 500 volt-hours using a
progressively increasing voltage protocol implemented by
a programmable high-voltage power supply. An Angelique™
computer-controlled gradient-casting system (produced
by LSB) is used to prepare second-dimensional sodium
dodecyl sulfate (SDS) polyacrylamide gradient slab
gels in which the top 5% of the gel is 11%T acrylamide, and
the lower 95% of the gel varies linearly from 11% to 18%T.
This system has recently been modified so as to employ a
commercially available 30.8%T acrylamide/N,N'-methylenebisacrylamide
prepared solution (thus avoiding the handling
of the solid acrylamide monomer) and three additional
stock solutions: buffer (made from Sigma pre-set
Tris), persulfate and N,N,N',N'-tetramethylethylenediamine
(TEMED). Each gel is identified by a computer-printed
filter paper label polymerized into the lower left corner
of the gel. First-dimensional IEF tube gels are loaded



** This material (succeeding certified batches of which are available from
Hoefer Scientific Instruments) has the most linear pH gradient produced
by any ampholyte tested except for the Pharmacia wide range
(which has an unacceptable tendency to bind high-molecular weight 
acidic proteins, causing them to streak). 



directly (as extruded) onto the slab gels without equilibration,
and held in place by polyester fabric wedges (Wedgies*,
produced by LSB) to avoid the use of hot agarose.
Second-dimensional slab gels are run overnight, in groups
of 20, in cooled DALT tanks (10 °C) with buffer circulation.
All run parameters, reagent source and lot information,
and notations of deviation from expected results are entered
by the technician responsible on a detailed, multi-page
record of the experiment.



2.3 Staining

Following SDS-electrophoresis, slab gels are stained for 
protein using a colloidal Coomassie Blue G-250 procedure
in covered plastic boxes, with 10 gels (totalling approxima- 
tely 1 L of gel) per box. This procedure (based on the work 
of Neuhoff [30, 31]) involves fixation in 1.5 L of 50% ethanol
and 2% phosphoric acid for 2 h, three 30 min washes,
each in 2 L of cold tap water, and transfer to 1.5 L of 34% 
methanol, 17% ammonium sulfate and 2% phosphoric acid 
for 3 h, followed by the addition of a gram of powdered Coo- 
massie Blue G-250 stain. Staining requires approximately 4 
days to reach equilibrium intensity, whereupon gels are
transferred to cool tap water and their surfaces rinsed to remove
any particulate stain prior to scanning. Gels may be
kept for several months in water with added sodium azide. 
The water washes remove ethanol that would dissolve the 
stain (and render the system noncolloidal, with high back- 
grounds). The concentrated ammonium sulfate and meth- 
anol solution is diluted by equilibration with the water vol- 
ume of the gels to automatically achieve the correct final 
concentrations for colloidal staining. Practical advantages 
of this staining approach can be summarized as follows: (i) 
the low, flat background makes computer evaluation of
small spots (max OD < 0.02) possible, especially when 
using laser densitometry; (ii) up to 1500 spots can be reli- 
ably detected on many gels (e.g., rat liver) at loadings low 
enough to preserve excellent resolution; and (iii) reprodu- 
cibility appears to be very good: at least several hundred 
spots have coefficients of reproducibility less than 15%.
This value is at least as good as previous CBB methods, and 
significantly better than many silver stain systems. 

2.4 Positional standardization 

The carbamylated rabbit muscle creatine phosphokinase 
(CPK) standards [32] are purchased from Pharmacia and 
BDH. Amino acid compositions, and numbers of residues 
present in proteins used for internal standardization, are 
taken from the Protein Identification Resource (PIR) se- 
quence database [33]. 



2.5 Computer analysis 

Stained slab gels are digitized in red light at 134 micron re- 
solution, using either a Molecular Dynamics laser scanner 
(with pixel sampling) or an Eikonix 78/99 CCD scanner.
Raw digitized gel images are archived on high-density DAT
tape (or equivalent storage media) and a grayscale videoprint
prepared from the raw digital image as hard-copy
backup of the gel image. Gels are processed using the Kepler*
software system (produced by LSB), a commercially
available workstation-based software package built on
some of the principles of the earlier TYCHO system [34-41].
Procedure PROC008 is used to yield a spotlist giving
position, shape and density information for each detected
spot. This procedure makes use of digital filtering, mathematical
morphology techniques and digital masking to remove
the background, and uses full 2-D least-squares optimization
to refine the parameters of a 2-D Gaussian shape
for each spot. Processing parameters and file locations are 
stored in a relational database, while various log files detailing
operation of the automatic analysis software are archived
with the reduced data. The computed resolution and
level of Gaussian convergence of each gel are inspected 
and archived for quality control purposes. 

Experiment packages are constructed using the Kepler ex- 
periment definition database to assemble groups of 2-D 
patterns corresponding to the experimental groups (e.g., 
treated and control animals). Each 2-D pattern is matched 
to the appropriate 'master' 2-D pattern (pattern
F344MST3 in the case of Fischer 344 rat liver), thereby 
providing linkage to the existing rodent protein 2-D data- 
bases. The software allows experiments containing hun- 
dreds of gels to be constructed and analyzed as a unit, with 
up to 100 gels displayed on the screen at one time for com- 
parative purposes and multiple pages to accommodate ex- 
periments of > 1000 gels. For each treatment, proteins 
showing significant quantitative differences vs. appropriate 
controls are selected using group-wise statistical parame- 
ters (e.g., Student's t-test, Kepler* procedure STUDENT). 
Proteins satisfying various quantitative criteria (such as P< 
0.001 difference from appropriate controls) are represented
as highlighted spots onscreen or on computer-plotted
protein maps and stored as spot populations (i.e., logical
vectors) in a liver protein database. Quantitative data
(spot parameters, statistical or other computed values) are
stored as real-valued vectors in the database. Analysis of coregulation
is performed using a Pearson product-moment
correlation (Kepler procedure CORREL) to determine
whether groups of proteins are coordinately regulated by
any of the treatments. Such groups can be presented graphically
on a protein map, and reported together with the statistical
criteria used to assess the level of coregulation. Multivariate
statistical analysis (e.g., principal components analysis)
is performed on data exported to SAS (SAS Institute).
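
The two statistics named above can be illustrated with a short, self-contained sketch. It is not the Kepler STUDENT or CORREL procedure, whose internals are not described here: the function names are hypothetical, the t-test shown is the textbook pooled-variance two-sample form (an assumption about which variant is meant), and no P value is computed because that would need the t distribution.

```python
from math import sqrt
from statistics import mean, variance


def student_t(treated, control):
    """Pooled-variance two-sample t statistic and degrees of freedom."""
    n1, n2 = len(treated), len(control)
    df = n1 + n2 - 2
    pooled = ((n1 - 1) * variance(treated) + (n2 - 1) * variance(control)) / df
    t = (mean(treated) - mean(control)) / sqrt(pooled * (1 / n1 + 1 / n2))
    return t, df


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length vectors."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Made-up abundances of one spot in treated vs. control animals.
print(student_t([2.0, 4.0, 6.0], [1.0, 3.0, 5.0]))
# Made-up abundances of two spots across the same four gels.
print(pearson_r([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]))
```

In use, the t statistic would be computed per spot across treatment groups, while the correlation would be computed per pair of spots across gels to look for coordinately regulated groups.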

2.6 Graphical data output 

Graphical results are prepared in GKS and translated 
within Kepler* into output for any of a variety of devices. 
Linedrawing output is typically prepared as Postscript and 
printed on an Apple LaserWriter. Detailed maps presented 
here have been generated using an ultra-high-resolution 
Postscript-compatible Linotronic output device. Greyscale 
graphics are reproduced from the workstation screen using 
a Seikosha videoprinter. Patterns are shown in the standard
orientation, with high molecular mass at the top and acidic 
proteins to the left. 

2.7 Experiment LSBC04 

In the study described here 12-week-old Charles River 
male F344 rats were used. Diets were prepared at LSB, 
based on a Purina 5755M Basal Purified Diet. Lovastatin 
and cholestyramine were obtained as prescription pharmaceuticals,
ground and mixed with the diet at concentrations
of 0.075% and 1%, respectively. The high cholesterol diet
was Purina 5801M-A (5% cholesterol plus 1% sodium cholate
in the control diet). Animal work was carried out by Microbiological
Associates (Bethesda, MD). Animals were acclimatized
for one week on the control diet, fed test or control
diets for one week, and sacrificed on day 8. Average
daily doses of lovastatin and cholestyramine in appropriate 
groups were 37 mg/kg/day and 5 g/kg/day, respectively, 
based on the weight of the food consumed. Liver samples 
were collected and prepared for 2-D electrophoresis according
to the standard liver protocol (homogenization in 8
volumes of 9 M urea, 2% NP-40, 0.5% dithiothreitol, 2%
LKB pH 9-11 carrier ampholytes, followed by centrifugation
for 30 min at 80 000 × g). Kidney, brain and plasma
samples were frozen. Gels were run as described above, 
and the data was analyzed using the Kepler* system. Gels 
were scaled, to remove the effect of differences in protein 
loading, by setting the summed abundances of a large num- 
ber of matched spots equal for each gel (linear scaling). 
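
The linear scaling step can be sketched as below. This is an illustration under stated assumptions rather than the Kepler implementation: the data layout (a dict of spot abundances per gel), the function name, the use of spots present on every gel as the "matched" set, and the choice of the mean sum as the common target are all hypothetical.

```python
def scale_gels(gels, target=None):
    """Rescale each gel so its matched-spot abundances sum to `target`.

    gels: {gel_id: {spot_id: abundance}}. Spots found on every gel
    are used as the matched set; all spots on a gel share one factor.
    """
    matched = set.intersection(*(set(spots) for spots in gels.values()))
    if not matched:
        raise ValueError("no spots are matched across all gels")
    sums = {g: sum(spots[s] for s in matched) for g, spots in gels.items()}
    if target is None:
        target = sum(sums.values()) / len(sums)  # mean of the gel sums
    return {g: {s: v * target / sums[g] for s, v in spots.items()}
            for g, spots in gels.items()}


gels = {"gel_a": {1: 10.0, 2: 30.0, 3: 5.0},
        "gel_b": {1: 20.0, 2: 60.0}}
print(scale_gels(gels))
```

Here gel_b carries twice the protein load of gel_a; after scaling, spots 1 and 2 have equal abundances on both gels, so remaining differences reflect regulation rather than loading.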



3 Results and discussion 

3.1 The rat liver protein 2-D map 

F344MST3 is a standard 2-D pattern of rat liver proteins,
based on the Fischer 344 strain. This pattern was initiated
from a single 2-D gel and extensively edited in an
experiment comparing it to a range of protein loads, so as
to include both small spots and well-resolved
representations of high-abundance spots. More than 700 rat
liver 2-D patterns have been matched to F344MST3 in a
series of drug effects and protein characterization
experiments, and numerous new spots (induced by specific
drugs, for instance) have been added as a result. A
modified version including additional spots present in the
Sprague-Dawley outbred rat has also been developed (data
not shown). Figure 1 shows a greyscale representation and
Fig. 2 a schematic plot of the master pattern. More than
1200 spots are included, most of which are visible on
typical gels loaded with 10 µL of solubilized liver protein
prepared by the standard method and stained with colloidal
Coomassie Blue. Master spot numbers (MSNs) have been
assigned to all proteins, and appear in the following
figures, each showing one quadrant of the pattern. Figure 3
shows the upper left (acidic, high molecular mass)
quadrant, Fig. 4 the upper right (basic, high molecular
mass) quadrant, Fig. 5 the lower left (acidic, low
molecular mass) quadrant, and Fig. 6 the lower right
(basic, low molecular mass) quadrant. The quadrants overlap
as an aid to moving between them. The gel position (in
100 micron units), isoelectric point (relative to the CPK
internal pI standards) and SDS molecular mass (from the
calibration curve in Fig. 8) are listed for each spot
(Table 1). Because of the precision of the CPK-pI values,
these parameters can be used to relate spot locations
between gel systems more reliably than pI measurements
expressed as pH. A major objective of current studies is
the identification of all major spots corresponding to
known liver proteins, as well as rigorous definitions of
subcellular organelle contents. Of particular interest to
us is the parallel development of identifications in the
rat and mouse liver maps, allowing detailed comparisons of
gene expression effects in the two systems. The results of
these studies will be presented systematically in a later
edition of this database, but we include here a useful
series of 22 orienting identifications as an aid to other
users of the rat liver pattern (Table



3.2 Carbamylated charge standards, computed pIs and
molecular mass standardization

We have previously shown that the use of a system of
closely-spaced internal pI markers (made by carbamylating a
basic protein) offers an accurate and workable solution to
the problem of assigning positions in the pI dimension [32].
The same system, based on 36 protein species made by
carbamylating rabbit muscle CPK, has been used here to
assign pIs to most rat liver acidic and neutral proteins.
The standards were coelectrophoresed with total liver
proteins, and the standard spots added to a special version
of the master pattern F344MST3. The gel x-coordinates of
all liver protein spots lying within the CPK charge train
were then transformed into CPK pI positions by
interpolation between the positions of immediately adjacent
standards (Table 1) using a Kepler® vector procedure.
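The interpolation between adjacent standards can be illustrated as follows. This is a sketch under stated assumptions, not the Kepler vector procedure itself: the function name `cpk_pi_position` and the coordinate values are hypothetical, linear interpolation is assumed, and the standards are assumed to be numbered 0, -1, -2, ... by the number of blocked lysines, so that more acidic spots carry more negative CPK pI positions.

```python
from bisect import bisect_right


def cpk_pi_position(x, standards):
    """Interpolate a CPK pI position for a spot at gel coordinate x.

    standards: list of (gel_coordinate, cpk_position) pairs sorted by
    gel coordinate (hypothetical layout); at least two are required.
    """
    xs = [coordinate for coordinate, _ in standards]
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("spot lies outside the CPK charge train")
    # Index of the standard immediately to the right of the spot.
    right = min(bisect_right(xs, x), len(xs) - 1)
    (x0, p0), (x1, p1) = standards[right - 1], standards[right]
    # Linear interpolation between the two flanking standards.
    return p0 + (p1 - p0) * (x - x0) / (x1 - x0)


# Invented coordinates (100 micron units) for four members of a charge train.
standards = [(400.0, -3), (520.0, -2), (650.0, -1), (800.0, 0)]
position = cpk_pi_position(585.0, standards)  # midway between -2 and -1
```

Because each spot is positioned only relative to its two nearest standards, local distortions of the gel affect the result far less than they would affect a single global calibration.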

It has proven possible to compute fairly accurate pI values
for many proteins from the amino acid composition [42].
We have attempted here to test a further elaboration of
this approach, in which we computed pIs for the CPK
standards themselves, based on our knowledge of the rabbit
muscle CPK sequence and the fact that adjacent members of
the charge train typically differ by blockage of one
additional lysine residue (Table 3). We compared these
values to similar computed pIs for an additional set of
carbamylated standards made from human hemoglobin beta
chains and a series of rat liver and human plasma proteins
of known position and sequence (Fig. 7, Table 4). The
result demonstrates good concordance between these systems.
Two proteins show significant deviations: liver fatty-acid
binding protein (FABP; #1 in Table 4) and protein
disulphide isomerase (#20 in the table). The FABP spot
present on F344MST3 may represent a charge-modified version
of a more basic parent spot closer to the expected pI, not
resolved in the IEF/SDS gel. Of particular importance is
the fact that, by comparing computed pIs of sequenced but
unlocated proteins with the CPK pIs, we can assign a
probable gel location without making any assumptions
regarding the actual gel pH gradient. This offers a useful
shortcut, given the vagaries of pH measurement on small
diameter IEF gels. We have used this approach to compute
the CPK pIs of all rat and mouse proteins in the PIR
sequence database, as an aid to protein identification
(data not shown).
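A computed pI of this kind can be sketched by summing Henderson-Hasselbalch charges over the ionizable groups and solving for the pH at which the net charge is zero. The code below is an illustration only, not the authors' program. The side-chain pK values follow the footer of Table 4, with the lysine value, which is hard to read in the source, taken as 10.5; the terminal pK values (8.0 and 3.1), the bisection solver, the function names and the example composition are all assumptions. Cysteine and tyrosine are ignored because the tables list no counts for them.

```python
# Side-chain pK values as read from the Table 4 footer (LYS assumed to be 10.5).
# NTERM and CTERM are assumed values added for this sketch.
PK = {"ASP": 3.9, "GLU": 4.1, "HIS": 6.0, "LYS": 10.5, "ARG": 12.5,
      "NTERM": 8.0, "CTERM": 3.1}
BASIC = ("HIS", "LYS", "ARG", "NTERM")
ACIDIC = ("ASP", "GLU", "CTERM")


def net_charge(ph, counts):
    """Net charge at a given pH from counts of ionizable groups."""
    positive = sum(counts[g] / (1.0 + 10.0 ** (ph - PK[g])) for g in BASIC)
    negative = sum(counts[g] / (1.0 + 10.0 ** (PK[g] - ph)) for g in ACIDIC)
    return positive - negative


def computed_pi(counts, lo=0.0, hi=14.0, tol=1e-6):
    """Find the pH of zero net charge by bisection (charge falls with pH)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if net_charge(mid, counts) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def charge_train(counts, n_steps):
    """Computed pIs for a parent protein and n_steps carbamylated forms,
    each with one additional lysine blocked."""
    return [computed_pi(dict(counts, LYS=counts["LYS"] - blocked))
            for blocked in range(n_steps + 1)]


# Hypothetical composition, not the real rabbit muscle CPK sequence.
protein = {"ASP": 28, "GLU": 27, "HIS": 17, "LYS": 34, "ARG": 18,
           "NTERM": 1, "CTERM": 1}
train = charge_train(protein, 5)
```

Each blocked lysine removes one positive charge, so the computed pI falls step by step along the train, mirroring the ladder of standard spots seen on the gel.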

In order to standardize SDS molecular weight (SDS-MW),
we have used a standard curve fitted to a series of
identified proteins (Fig. 8). Rather than using molecular
mass per se, we have elected to use the number of amino
acids in the polypeptide chain, as perhaps a better
indication of the length of the SDS-coated rod that is
sieved by the second dimension slab. The resulting values
were multiplied by 112 (the weighted average mass of amino
acids in sequenced proteins) to give predicted molecular
masses. Because we use gradient slabs, we have not
constrained the fitted curve to conform to any
predetermined model; rather we tried many equations and
selected the best using the
program Tablecurve on a PC. The equation chosen was
y = a + bx + c/x, where y is the number of residues, x is
the gel y-coordinate, a is 511.83, b is -0.2731 and c is
331838.01. The resulting fit appears to be fairly good over
a broad range of molecular mass.
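The calibration can be written out as a short function to show how a gel position is turned into a predicted mass. This is a sketch under stated assumptions: the function names are hypothetical, the gel y-coordinate is taken to be in the 100 micron units used elsewhere in the database, and the coefficient c is read here as 331838.01. The decimal point of c is unclear in the source, and this reading is an assumption chosen because it gives plausible chain lengths.

```python
# Fitted coefficients of residues = A + B*y + C/y.
# A and B are as given in the text; C is an assumed reading (see above).
A = 511.83
B = -0.2731
C = 331838.01

MEAN_RESIDUE_MASS = 112.0  # weighted average residue mass quoted in the text


def residues_from_gel_y(gel_y):
    """Estimated number of residues for a spot at gel y-coordinate gel_y."""
    return A + B * gel_y + C / gel_y


def predicted_mass(gel_y):
    """Predicted molecular mass (Da) from the estimated chain length."""
    return MEAN_RESIDUE_MASS * residues_from_gel_y(gel_y)


# Positions from near the top of the gel to near the bottom (invented values).
masses = [predicted_mass(y) for y in (300.0, 600.0, 1000.0, 1500.0, 2000.0)]
```

With these coefficients both the linear and the reciprocal term decrease as the coordinate grows, so the predicted mass falls monotonically from the top of the slab to the bottom, as expected for an SDS gradient gel.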



3.3 An example of rat liver gene regulation: Cholesterol
metabolism

Experiment LSBC04 was designed as a small-scale test of
the regulation of cholesterol metabolism in vivo by three
agents included in the diet: lovastatin (Mevacor®, an
inhibitor of HMG-CoA reductase); cholestyramine (a bile
acid sequestrant that has the effect of removing
cholesterol from the gut-liver recirculation); and
cholesterol itself. The first two agents should lower
available cholesterol and the third should raise it,
allowing manipulation of relevant gene expression control
systems in both directions. Such an experiment offers an
interesting test of the 2-D mapping system since most of
the pathway enzymes are present in low abundance, many are
membrane-bound and difficult to solubilize, and the pathway
itself is complex. Approximately 3000 proteins were
separated and detected in liver homogenates. Twenty-one
proteins were found to be affected by at least one
treatment, and these could be divided into several
coregulated groups.

3.3.1 MSN 413 (putative cytosolic HMG-CoA synthase)
and sets of spots regulated coordinately or inversely

One group of spots (including a spot assigned to the
cytosolic HMG-CoA synthase, MSN 413) showed the expected
increase in abundance with lovastatin or cholestyramine,
the synergistic further increase with lovastatin and
cholestyramine, and a dramatic decrease with the high
cholesterol diet. Spot number 413 is the most strongly
regulated protein in the present experiment, showing a 5-
to 10-fold induction after a 1-week treatment with 0.075%
lovastatin and 1% cholestyramine in the diet (Figs. 9 and
10). Its expression follows precisely the expectation for
an enzyme whose abundance is controlled by the cholesterol
level; it is progressively increased from the control
levels by cholestyramine, lovastatin and lovastatin plus
cholestyramine, and it sinks below the threshold of
detection in animals fed the high cholesterol diet. This
spot has been tentatively identified as the cytosolic
HMG-CoA synthase, based on a reaction with an antiserum to
that protein provided by Dr. Michael Greenspan at Merck
Sharp & Dohme Research Laboratories. This enzyme lies
immediately before HMG-CoA reductase in the liver
cholesterol biosynthesis pathway, and is known to be
co-regulated with it. Spot 413 has an SDS molecular weight
of about 54 000 and a CPK pI of -11.4, in reasonably close
agreement with a molecular weight of 57 300 and a CPK pI of
-15.7 computed from the known sequence of the hamster
enzyme [43].

Using a classical product-moment correlation test (Kepler
procedure CORREL), a series of five additional spots was
found to be coregulated with 413. The level of correlation
was exceedingly high (> 95%). Two of these, 1250 and 933,
are at similar molecular weights and approximately one
charge more acidic than 413 (Fig. 9), indicating that they
may be covalently modified forms of the 413 polypeptide.
This suspicion is strengthened by the observation that both
spots are also stained by the antibody to cytosolic
HMG-CoA synthase. The remaining three correlated spots
appear to comprise an additional related pair (1253 and
1001) of around 40 kDa and a single spot (1129) of around
28 kDa. Because these two presumed proteins are present at
substantially lower abundances than 413, and because the
cytosolic HMG-CoA synthase is reported to consist of only
one type of polypeptide, they are likely to represent
other, very tightly coregulated enzymes. A second group of
six spots was selected based on a regulatory pattern close
to the inverse of that for spot 413 (MSNs 34, 79, 178, 182,
204, 347; data not shown). For these proteins, the lowest
level of expression occurs with exposure to lovastatin plus
cholestyramine and the highest level upon exposure to the
high-cholesterol diet. Spots 182 and 79 are highly
correlated and lie about one charge apart at the same
molecular weight; they may thus be isoforms of a single
protein. The other four spots probably represent additional
enzymes or subunits.
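The selection of coordinately and inversely regulated spots can be sketched with a plain product-moment (Pearson) correlation. This is an illustration, not the Kepler CORREL procedure: the function names, the spot labels A, B and C and the abundance values are invented, and the cut-off of 0.95 is our reading of the "> 95%" quoted above, which is an assumption.

```python
from math import sqrt


def pearson_r(a, b):
    """Product-moment correlation coefficient of two equal-length series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def coregulated(reference, spots, threshold=0.95):
    """Split spots into those regulated with and inversely to the reference.

    reference: abundances of the reference spot, one value per gel.
    spots: dict mapping a spot label to its abundances on the same gels.
    """
    coordinate, inverse = [], []
    for label, abundances in spots.items():
        r = pearson_r(reference, abundances)
        if r >= threshold:
            coordinate.append(label)
        elif r <= -threshold:
            inverse.append(label)
    return coordinate, inverse


# Invented abundances on five gels; labels are hypothetical, not real MSNs.
reference = [1.0, 2.0, 3.5, 5.0, 9.0]
spots = {
    "A": [2.1, 4.0, 7.2, 9.8, 18.5],   # rises with the reference
    "B": [9.0, 7.5, 6.0, 4.0, 0.5],    # falls as the reference rises
    "C": [3.0, 1.0, 4.0, 1.0, 5.0],    # only loosely related
}
coordinate, inverse = coregulated(reference, spots)
```

Running every spot in the pattern against one reference spot in this way yields both the coregulated set and, with the sign reversed, the inversely regulated set described in the text.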

3.3.2 MSN 235 and coregulated spots

A third group of five spots, mainly comprised of
mitochondrial proteins including putative mitochondrial
HMG-CoA synthase spots, showed a modest induction by
lovastatin alone, but little or no effect with any of the
other treatments (including the combination of lovastatin
and cholestyramine; Fig. 12). This result is intriguing
because lovastatin was expected to affect only the
regulation of enzymes of cholesterol synthesis, which is
entirely extra-mitochondrial. Three of the spots (235, 134,
144) form a closely-packed triad at approximately 30 kDa,
and are likely to represent isoforms of one protein. All
three spots are stained by an antibody to the mitochondrial
form of HMG-CoA synthase obtained from Dr. Greenspan.
Subcellular fractionation indicates a mitochondrial
location. The other two spots (633 at about 38 kDa and 724
at about 69 kDa) are each present at lower abundance than
the members of the triad.



3.3.3 An example of an anti-synergistic effect

A sixth spot (367) shows strong induction by lovastatin
(two- to threefold), and about half as much induction with
lovastatin plus cholestyramine, but without sharing the
animal-animal heterogeneity pattern of the 235-set
(Fig. 13). This protein is also mitochondrial, and
represents the clearest example of an anti-synergistic
effect of lovastatin and cholestyramine. The existence of
such an effect demonstrates that lovastatin and
cholestyramine do not act exclusively through the same
regulatory pathway.

3.3.4 Complexity of the cholesterol synthesis pathway

Taken together, these results suggest that treatment with
lovastatin alone can affect both cytosolic and
mitochondrial pathways using HMG-CoA, while cholestyramine,
on the other hand, either alone or in combination with
lovastatin, produces a strong effect on the putative
cytosolic pathway, but little or no effect on the putative
mitochondrial pathway. An explanation for this difference
may lie in lovastatin's effect on levels of HMG-CoA and
related precursor compounds that are exchanged between the
cytosol and the mitochondrion, whereas cholestyramine
should affect only the cytosolic pathways directly
controlled by cholesterol and bile acid levels. It remains
to be explained why some proteins of the putative
mitochondrial pathway are so much more variable in their
expression in all groups. An examination of all the
coregulated groups suggests that quantitative statistical
techniques can extract a wealth of interesting information
from large sets of reproducible gels. The abundances of
spots in the 413 coregulation group, for example, show a
remarkable level of concordance in their relative
expression among the five individuals of the lovastatin and
cholestyramine treatment group. This effect is not due to
differences in total protein loading, since they have
already been removed by scaling, and since proteins with
quite different regulation patterns can be demonstrated
(e.g., Fig. 13). Such effects raise the possibility that
many gene coregulation sets may be revealed through the
study of a sufficiently large population of control animals
(i.e., without any experimental manipulation). This
approach, exploiting natural biological variation in
protein expression instead of drug effects, offers an
important incentive for the construction of a large library
of control animal patterns.



4 Conclusions

Because of the widespread use of rat liver in both basic
biochemistry and in toxicology, there is a long-term need
for a comprehensive database of liver proteins. The rat
liver master pattern presented here has proven to be an
accurate representation of this system, having been matched
to more than 700 gels to date. As the number of proteins
identified and the number of compounds tested for gene
expression effects grows, we expect this database to
contribute valuable insights into gene regulation. Its
practical utility in several areas of mechanistic
toxicology is already being demonstrated.

Received September 11, 1991
Figure 1. Synthetic representation of the standard rat liver 2-D master pattern, rendered as a greyscale image using a videoprinter.
Figure 2. Schematic representation of the master pattern (the same as Fig. 1), useful as an aid in relating specific areas of Fig. 1 and the following detailed quadrants.
Figure 3. Upper left (high molecular weight, acidic) quadrant (#1) of the rat liver map, showing spot numbers.



Electrophoresis 1991, 12, 907-930

Database of rat liver proteins



Figure 4. Upper right (high molecular weight, basic) quadrant (#2) of the rat liver map, showing spot numbers.
Figure 7. (a) Plot of computed isoelectric point versus gel x-position for
two sets of carbamylated standard proteins (rabbit muscle CPK [+] and
human hemoglobin β chain, filled diamonds) and several other proteins
(shaded squares). (b) The identities of the various proteins represented
by the squares are indicated by the numbers in corresponding positions
on (a); these refer to Table 4.
Figure 9. Montage showing effects in the region of MSN:413. The
montage shows a small window into one portion of the 2-D pattern,
one row of windows for each experimental group, and one panel for
each gel in the experiment. The left-most pattern in each row is a
group-specific copy of the master pattern followed by the patterns
for the five individual rats in the group. The highlighted protein
spots (filled circles) are spot 413 (on the right of each panel;
identified as cytosolic HMG-CoA synthase) and two modified forms of
it (1250 and 933). From the top, the rows (experimental groups) are:
high cholesterol, controls, cholestyramine, lovastatin, and
lovastatin plus cholestyramine.
Figure 10. Bargraph showing the quantitative effects of various
treatments on the abundance of MSN:413 (cytosolic HMG-CoA synthase)
in the gels of Fig. 9.




Figure 11. Bargraphs of a series of six coregulated spots including
MSN:413. In the bargraphs, the abundances of the appropriate spot
(master spot number shown at the top of the panel) in each animal are
shown. The five five-animal groups are in the order (left to right):
high cholesterol, controls, cholestyramine, lovastatin, and
lovastatin plus cholestyramine. Each bar within a group represents
one experimental animal liver (one 2-D gel). Note the correlated
expression of the 6 spots, especially in the two far right (most
strongly induced) groups.

Figure 12. Data on a second coregulated group of spots, presented as
in Fig. 11. The fourth experimental group (lovastatin) shows a modest
induction, while the fifth group (lovastatin plus cholestyramine)
does not.

Figure 13. Data on spot MSN:367, presented as in Fig. 11. This protein
shows unambiguously the anti-synergistic effect of lovastatin and
cholestyramine (fifth group) as compared to lovastatin (fourth
group). This response contrasts strongly with the regulation pattern
seen in Fig. 11.
Table 1. Master table of proteins in the rat liver database

Table 3. Computed pIs of two sets of carbamylated protein standards: rabbit muscle CPK and human
hemoglobin (Hb)

PIR Name   #ASP   #GLU   #HIS   #LYS   #ARG   NH2-   Calc   Real
Table 4. Computed pIs of some known proteins related to measured CPK pIs

#    Protein name                                            PIR name
     Creatine phosphokinase (CPK), rabbit muscle             KIRBCM
1    Fatty acid-binding protein, rat hepatic                 F2RTL
2    β2-microglobulin, human                                 MGHUB2
3    Carbamoyl-phosphate synthase, rat                       SYRTCA
4    Prealbumin (serum albumin precursor), rat               ABRTS
5    Serum albumin, rat                                      ABRTS
6    Superoxide dismutase (Cu-Zn, SOD), rat                  A26810
7    Phospholipase C, phosphoinositide-specific (?), rat     A28807
8    Albumin, human                                          ABHUS
9    Apo A-I lipoprotein, rat                                A24700
10   proApo A-I lipoprotein, human                           LPHUA1
11   NADPH cytochrome P-450 reductase, rat                   RDRT04
12   Retinol binding protein, human                          VAHU
13   Actin beta, rat                                         ATRTC
14   Actin gamma, rat                                        ATRTC
15   Apo A-I lipoprotein, human                              LPHUA1
16   Apo A-IV lipoprotein, human                             LPHUA4
17   Tubulin alpha, rat                                      UBRTA
18   F1 ATPase beta, bovine                                  PWBOB
19   Tubulin beta, pig                                       UBPGB
20   Protein disulphide isomerase (PDI), rat hepatic         ISRTSS
21   Cytochrome b5, rat                                      CBRT5
22   Apo C-II lipoprotein, human                             LPHUC2

Columns: #ASP   #GLU   #HIS   #LYS   #ARG   Calc pI   Real CPK
Amino acid pK assumed in calculation: 3.9 (ASP), 4.1 (GLU), 6.0 (HIS), 10.5 (LYS), 12.5 (ARG)